Abstract

Communication systems are traditionally designed to have tight transmitter-receiver synchronization. This requirement has negligible overhead in the high-SNR regime. However, in many applications, such as wireless sensor networks, communication needs to happen primarily in the energy-efficient regime of low SNR, where requiring tight synchronization can be highly suboptimal.

In this paper, we model the noisy channel with synchronization errors as an insertion/deletion/substitution channel. For this channel, we propose a new communication scheme that requires only loose transmitter-receiver synchronization. We show that the proposed scheme is asymptotically optimal for the Gaussian channel with synchronization errors in terms of energy efficiency as measured by the rate per unit energy. In the process, we also establish that the lack of synchronization causes negligible loss in energy efficiency. We further show that, for a general discrete memoryless channel with synchronization errors and a general cost function on the input, the rate per unit cost achieved by the proposed scheme is within a factor two of the information-theoretic optimum.

I. Introduction

Traditionally, data transmission in a communication system is based on tight synchronization between the transmitter and the receiver. This tight synchronization is usually achieved through either of two strategies. In the first strategy, synchronization is achieved through periodic transmission of pilot signals, followed by transmission of information over the synchronized channel. In the second strategy, data bits are modulated differentially, which implicitly achieves tight synchronization.

The above strategies work well at high signal-to-noise ratios (SNRs), as the energy overhead of achieving tight synchronization is negligible compared to that of data transmission. However, in many applications, such as wireless sensor networks, space communication, or in general any communication system requiring high energy efficiency, communication by necessity has to primarily take place in the low-SNR regime (due to the concavity of the power-rate function). In such scenarios, the energy overhead to achieve tight synchronization becomes significant and can render the aforementioned strategies highly suboptimal in terms of energy efficiency. In fact, it can be shown that requiring tight transmitter-receiver synchronization can cause an arbitrarily large loss in energy efficiency.

To mitigate this, in this paper, we develop and analyze a framework to perform data transmission while only requiring loose synchronization between the transmitter and the receiver. To focus on the energy-efficiency aspect, we choose the rate per unit cost (with energy being a prime example of the cost) as our performance metric. We model synchronization errors through channel insertions/deletions, an approach introduced in [1]. To motivate this model, consider a transmitter-receiver pair with unsynchronized clocks, as illustrated in Fig. 1. Due to the absence of synchronization, the value of the clock at the receiver exhibits drift and jitter with respect to the value of the reference clock at the transmitter. This leads to the receiver sampling the transmitted signal either faster than the transmitter, leading to channel insertions, or slower, leading to channel deletions.

Before we describe the contributions of this work in more detail in Section I-B, we provide a brief overview of related work on energy-efficient communication and on channels with synchronization errors.
Fig. 1. An example of unsynchronized transmitter-receiver clocks. The figure plots the value of the receiver clock (y-axis) as a function of the value of the reference clock at the transmitter (x-axis). The drift and jitter of the receiver clock are visible. For a transmitted input sequence, the lack of synchronization leads to insertions/deletions in the corresponding sampled output sequence at the receiver (illustrated here for the case without receiver noise).

A. Related Work 

It is well known that the capacity per unit energy of a Gaussian channel with noise variance $\eta^2$ is $1/(2\eta^2 \ln 2)$, and that this can be achieved with appropriately designed pulse-position modulation [2]. For a general discrete memoryless channel (DMC), [3] has analyzed the reliability function of the rate per unit cost. Subsequently, [4] has obtained a succinct single-letter characterization for the capacity per unit cost of a DMC with a general cost function. These results, however, strongly depend on the channel being memoryless. As discussed next, synchronization errors introduce memory in the channel, and thus the aforementioned results do not apply.

The insertion/deletion/substitution channel was introduced by Dobrushin in 1967 [1]. Despite significant research effort since then, the capacity of this channel is still not known [5]-[12]. Indeed, even for one of the simplest versions of this problem, the noiseless binary deletion channel, only loose bounds on the capacity are known. For example, the recent paper [6] provides an approximation of the capacity of the binary deletion channel to within a factor 9. The main difficulty in analyzing these channels is due to the channel memory introduced by the insertions and deletions, which prevents direct application of standard information-theoretic tools.

It is worth emphasizing that the synchronization errors considered here are those at the symbol level. 
There are other types of synchronization issues. One such issue is frame synchronization, where errors 
arise due to incorrect identification of the location of the "sync word" in the frame [13]. Energy-efficient
communication in the presence of such frame asynchronism has been investigated in [14].

B. Summary of Results 

In this paper, we consider communication channels which, in addition to synchronization errors, exhibit receiver noise. We model the end-to-end communication channel between the transmitter and the receiver as the concatenation of two sub-channels, as illustrated in Fig. 2. The first sub-channel is an insertion/deletion channel (IDC), which models synchronization errors. The second sub-channel is a noisy memoryless channel, which models errors due to the receiver noise. The concatenation of the two channels is an insertion/deletion/substitution channel. The details of this model are discussed in Section II.

Fig. 2. The end-to-end communication channel between an unsynchronized transmitter-receiver pair is modeled by concatenating an insertion/deletion channel IDC($\mu, \sigma^2$) with a discrete memoryless channel DMC($W$).

We first study communication systems with synchronization errors operating over Gaussian channels. 
We propose a new communication scheme that requires only loose synchronization between the transmitter 
and the receiver. Specifically, the scheme deals with the lack of transmitter-receiver synchronization by 
developing new pulse-position-modulation waveforms, where the signal energy is spread over increasing 
intervals and guard spaces of increasing lengths are introduced. Decoding at the receiver is based on a 
sequence of independent hypothesis tests. When the aforementioned durations are chosen appropriately, 
we show that the scheme asymptotically achieves the information-theoretically optimal performance in 
terms of the energy efficiency, i.e., the capacity per unit energy. In the process, we also establish that the 
lack of transmitter-receiver synchronization causes negligible loss in terms of energy efficiency. 

We then analyze communication systems with synchronization errors operating over general DMCs 
and with a general cost function. We generalize the proposed achievable scheme for the Gaussian case to 
DMCs, and we show that the scheme achieves a rate per unit cost within a factor two of the information-theoretic optimum. Thus, while only loose bounds are known for the capacity of the general insertion/deletion/substitution channel, we provide here a tight approximation for its capacity per unit cost. To establish this, we obtain an upper bound on the capacity per unit cost of the channel in Fig. 2 by considering the effect of the IDC as a specific way of encoding for the DMC with an appropriately modified cost function. The upper bound is then obtained by utilizing the characterization of the capacity per unit cost for memoryless channels in [4].

C. Organization 

The remainder of the paper is organized as follows. Section II provides the detailed description of the channel model and the problem formulation. The main results of the paper are summarized in Section III. Section IV describes the proposed scheme for a general discrete memoryless channel with synchronization errors. Section V derives an upper bound on its capacity per unit cost. Section VI discusses the proposed scheme for the Gaussian channel with synchronization errors and establishes its asymptotic optimality. Lastly, Section VII analyzes the performance of the scheme for Gaussian channels with synchronization errors where even the statistical properties of the errors are not known precisely a priori (i.e., the compound setting).

II. Channel Model and Problem Statement 

We consider an insertion/deletion/substitution channel with a cost constraint. The insertion/deletion/substitution channel consists of an IDC connected to a DMC as shown in Fig. 2 in Section I-B. The insertion/deletion part of the channel models synchronization errors (see Fig. 1 in Section I); the substitution part models noise.

The IDC maps the input sequence $(x[1], \dots, x[T]) \in \mathcal{X}^T$ to the output sequence $(\mathsf{x}[1], \dots, \mathsf{x}[\mathsf{L}]) \in \mathcal{X}^{\mathsf{L}}$ for some random length $\mathsf{L}$, where here and in the following we use sans-serif font to denote random variables. The actions of the IDC are governed by the i.i.d. sequence of states $(\mathsf{s}[1], \dots, \mathsf{s}[T]) \in \{0, 1, 2, \dots\}^T$. State $\mathsf{s}[t]$ describes how many times input symbol $x[t]$ appears at the output of the IDC.

Formally, the total number of output symbols is given by

$$\mathsf{L} \triangleq \sum_{t=1}^{T} \mathsf{s}[t].$$

Observe that $\mathsf{L}$ is a random variable depending on the state sequence of the IDC. Define for each $\ell \in \{1, 2, \dots, \mathsf{L}\}$ the random variable

$$\mathsf{t}[\ell] \triangleq \min\Bigl\{t : \sum_{j=1}^{t} \mathsf{s}[j] \ge \ell\Bigr\}.$$

The relationship between the input and output of the IDC is then given by

$$\mathsf{x}[\ell] = x\bigl[\mathsf{t}[\ell]\bigr].$$

We illustrate the operation of the IDC with an example. 
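Example 1.

$$\begin{aligned}
x &= \bigl(x[1],\ x[2],\ x[3],\ x[4],\ x[5],\ x[6]\bigr) \\
\mathsf{s} &= (1,\ 1,\ 2,\ 1,\ 0,\ 2) \\
\mathsf{t} &= (1,\ 2,\ 3,\ 3,\ 4,\ 6,\ 6) \\
\mathsf{x} &= \bigl(x[1],\ x[2],\ x[3],\ x[3],\ x[4],\ x[6],\ x[6]\bigr)
\end{aligned}$$

In the example, $T = 6$ and $\mathsf{L} = 7$. Here $x$ is the vector of inputs and $\mathsf{x}$ is the vector of outputs of the IDC, $\mathsf{s}$ is the vector of states, and $\mathsf{t}$ is the corresponding vector of sampling times. For concreteness, assume

$$x = (0\ 0\ 1\ 0\ 1\ 0).$$

Then the output of the IDC is

$$\mathsf{x} = (0\ 0\ 1\ 1\ 0\ 0\ 0),$$

see also Fig. 1 in Section I.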
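The mapping in Example 1 can be reproduced with a few lines of code. The following is a minimal sketch, not part of the original development: the function names `idc_output` and `sampling_times` are hypothetical, and the state vector $(1, 1, 2, 1, 0, 2)$ is the one consistent with $T = 6$, $\mathsf{L} = 7$, and the sampling times $(1, 2, 3, 3, 4, 6, 6)$ of the example.

```python
def idc_output(x, s):
    """Apply an insertion/deletion channel: input symbol x[t] appears s[t] times."""
    if len(x) != len(s):
        raise ValueError("x and s must have the same length")
    out = []
    for symbol, count in zip(x, s):
        out.extend([symbol] * count)  # count == 0 deletes, count >= 2 duplicates
    return out


def sampling_times(s):
    """Return the 1-indexed sampling times t[l] = min{t : s[1] + ... + s[t] >= l}."""
    times = []
    for t, count in enumerate(s, start=1):
        times.extend([t] * count)
    return times


x = [0, 0, 1, 0, 1, 0]
s = [1, 1, 2, 1, 0, 2]
print(idc_output(x, s))    # [0, 0, 1, 1, 0, 0, 0]
print(sampling_times(s))   # [1, 2, 3, 3, 4, 6, 6]
```

Repeating the index $t$ exactly $\mathsf{s}[t]$ times is equivalent to the minimum-based definition of $\mathsf{t}[\ell]$, since output positions between two consecutive partial sums of the states all map to the same input index.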
We denote by

$$\mu \triangleq \mathbb{E}(\mathsf{s}[1])$$

and

$$\sigma^2 \triangleq \operatorname{var}(\mathsf{s}[1])$$

the mean and variance of the insertion/deletion process, respectively, and we refer to any IDC with those parameters as IDC($\mu, \sigma^2$). Here, $\mu$ and $\sigma^2$ can be interpreted as capturing the drift and jitter of the receiver clock, respectively. In most situations arising in practice, the parameter $\mu$ is very close to 1.

Our results will be presented for arbitrary insertion/deletion processes with finite mean and variance. 
For illustrative purposes, we present a commonly used special case of this setting. 



Example 2. One commonly used definition of the state process is

$$\mathsf{s}[t] \triangleq \begin{cases} 1, & \text{w.p. } 1 - d, \\ 0, & \text{w.p. } d, \end{cases}$$

for some parameter $d$. This results in the so-called deletion channel, which deletes each input symbol with probability $d$.
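For the deletion channel, the state is a Bernoulli variable, so its mean and variance are $1 - d$ and $d(1 - d)$. The sketch below computes both quantities for an arbitrary state distribution; the helper name `state_mean_and_variance` and the dictionary representation of the distribution are assumptions made for illustration only.

```python
def state_mean_and_variance(pmf):
    """Mean and variance of an IDC state with distribution pmf = {value: probability}."""
    total = sum(pmf.values())
    if abs(total - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to 1")
    mean = sum(k * p for k, p in pmf.items())
    second_moment = sum(k * k * p for k, p in pmf.items())
    return mean, second_moment - mean ** 2


d = 0.1  # illustrative deletion probability
mu, sigma2 = state_mean_and_variance({1: 1 - d, 0: d})
print(mu, sigma2)  # approximately 1 - d and d * (1 - d)
```

The same helper applies to state distributions with insertions, for example one that also puts mass on the value 2.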

The output of the IDC($\mu, \sigma^2$) is then fed into a discrete memoryless channel DMC($W$) described by the distribution $W(y|x)$ of the channel output $y \in \mathcal{Y}$ given the channel input $x \in \mathcal{X}$. The insertion/deletion/substitution channel is the (random) mapping from $x$ to $\mathsf{y}$ described by the concatenation of the IDC($\mu, \sigma^2$) and the DMC($W$).
The goal is to maximize the number of bits reliably transmitted per unit cost over this insertion/deletion/substitution channel formed by the concatenation of the IDC($\mu, \sigma^2$) with the DMC($W$). We adopt the framework of [3], [4]. The cost function $c\colon \mathcal{X} \to \mathbb{R}_+$ associates with each input symbol $x \in \mathcal{X}$ the cost $c(x)$ incurred by transmitting $x$ over the channel. We make the assumption that $\mathcal{X}$ contains a free input symbol, and without loss of generality we label this symbol as 0. In other words, $0 \in \mathcal{X}$ and $c(0) = 0$. For an input sequence $(x[1], \dots, x[T]) \in \mathcal{X}^T$, the cost is given by

$$c\bigl((x[1], \dots, x[T])\bigr) \triangleq \sum_{t=1}^{T} c(x[t]).$$

A $(T, M, P, \varepsilon)$ code consists of $M$ codewords

$$(x_m[1], \dots, x_m[T]), \quad m \in \{1, 2, \dots, M\},$$

each of length $T$ and cost at most $P$, with average (assuming equiprobable messages) probability of decoding error at most $\varepsilon$.

Definition. A rate $R$ per unit cost is achievable if for every $\varepsilon > 0$ and every large enough $M$ there exists a $(T, M, P, \varepsilon)$ code satisfying$^{1}$ $\log(M)/P \ge R$. The capacity per unit cost $C$ is the supremum of achievable rates per unit cost.

Throughout this paper, we are interested in $C(\mu, \sigma^2, W)$, the capacity per unit cost of the insertion/deletion/substitution channel given by the IDC($\mu, \sigma^2$) concatenated with the DMC($W$). We also consider the compound capacity per unit cost $C([\mu_1, \mu_2], [\sigma_1^2, \sigma_2^2], W)$, for which the encoder and decoder have to be able to operate on any IDC($\mu, \sigma^2$) with $\mu \in [\mu_1, \mu_2]$ and $\sigma^2 \in [\sigma_1^2, \sigma_2^2]$ without knowledge of the actual values of $\mu$ and $\sigma^2$. This compound setting is of practical relevance, since usually the mean clock drift $\mu$ is only specified as an interval (and might indeed be slowly time varying) and is hence not known exactly at the transmitter or the receiver. We also treat the Gaussian version of the problem, where the output of the IDC($\mu, \sigma^2$) is subject to additive Gaussian noise of mean zero and variance $\eta^2$. The cost function is in this case the signal energy, i.e., $c(x) = x^2$. With slight abuse of notation, we refer to the capacity per unit energy in this case as $C(\mu, \sigma^2, \mathcal{N}(0, \eta^2))$.



III. Main Results 

In this section, we summarize the main results; their proofs are discussed in subsequent sections. We start with the results for Gaussian channels with synchronization errors for the case where the statistics $\mu$ and $\sigma^2$ of the insertion/deletion channel are known at the transmitter and the receiver.

Theorem 1. The Gaussian channel with synchronization errors having insertion/deletion process with mean $\mu$ and variance $\sigma^2$ and having noise power $\eta^2$ has capacity per unit energy

$$C\bigl(\mu, \sigma^2, \mathcal{N}(0, \eta^2)\bigr) = \frac{\mu}{2\eta^2 \ln 2}.$$

Recall that the capacity per unit energy of the Gaussian channel is $1/(2\eta^2 \ln 2)$. Furthermore, as discussed in Section II, the mean $\mu$ of the insertion/deletion process is typically close to 1. Hence, Theorem 1 implies that the lack of synchronization results in only negligible loss in the capacity per unit energy.

To establish achievability, we propose a new communication scheme that jointly performs data modulation and loose synchronization. To this end, we develop signaling waveforms where the signal energy is spread over increasing intervals and guard spaces of increasing lengths are introduced. Decoding at the receiver is based on a sequence of independent hypothesis tests, which are carefully chosen to account for the uncertainty arising due to the lack of tight synchronization. In Section VI, we show that, by appropriately choosing the aforementioned durations, the probability of error can be made arbitrarily small for any rate per unit energy up to $C$. The details of the analysis of this scheme are reported in Section VI. The upper bound in Theorem 1 follows as a special case of the upper bound derived for a general DMC with synchronization errors, discussed in Theorem 3 below.

$^{1}$Throughout this paper, $\log(\cdot)$ and $\ln(\cdot)$ denote the logarithms to the base 2 and $e$, respectively.

Next, consider the case where the exact statistical properties of the insertion/deletion process are not known a priori. Instead, we only know a range for each parameter, i.e., the mean $\mu$ is in $[\mu_1, \mu_2]$ and the variance is in $[0, \sigma^2]$. We are interested in a communication scheme that works simultaneously for every possible set of parameters in this range. As pointed out in Section II, this compound setting is of practical relevance, since the precision of the transmitter and receiver clocks is usually only known to lie within some range.

Theorem 2. The class of Gaussian channels with synchronization errors having insertion/deletion process with mean $\mu \in [\mu_1, \mu_2]$ and variance upper bounded by $\sigma^2$ and having noise power $\eta^2$ has compound capacity per unit energy

$$C\bigl([\mu_1, \mu_2], [0, \sigma^2], \mathcal{N}(0, \eta^2)\bigr) = \frac{\mu_1}{2\eta^2 \ln 2}.$$



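Comparing Theorems 1 and 2, we see that

$$C\bigl([\mu_1, \mu_2], [0, \sigma^2], \mathcal{N}(0, \eta^2)\bigr) = \min_{\mu \in [\mu_1, \mu_2]} C\bigl(\mu, \sigma^2, \mathcal{N}(0, \eta^2)\bigr).$$

Since a scheme for the compound setting must work for any possible value of $\mu$ and $\sigma^2$ of the insertion/deletion process, it is clear that it must work for the worst one, so that

$$C\bigl([\mu_1, \mu_2], [0, \sigma^2], \mathcal{N}(0, \eta^2)\bigr) \le \min_{\mu \in [\mu_1, \mu_2]} C\bigl(\mu, \sigma^2, \mathcal{N}(0, \eta^2)\bigr) = \frac{\mu_1}{2\eta^2 \ln 2}.$$

Theorem 2 thus shows that there is no further loss beyond this resulting from the lack of precise knowledge of the insertion/deletion statistics at the transmitter and the receiver. The proof of Theorem 2 is presented in Section VII.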

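Finally, consider an insertion/deletion channel IDC($\mu, \sigma^2$) concatenated with a general discrete memoryless channel DMC($W$) specified by its transition probability matrix $W\colon \mathcal{X} \to \mathcal{Y}$. Furthermore, consider an arbitrary cost function $c\colon \mathcal{X} \to \mathbb{R}_+$. As mentioned in Section II, we assume that $0 \in \mathcal{X}$ and $c(0) = 0$. Then, the following bounds hold on the capacity per unit cost.

Theorem 3. The insertion/deletion/substitution channel consisting of an IDC($\mu, \sigma^2$) concatenated with a DMC($W$) has capacity per unit cost $C(\mu, \sigma^2, W)$ satisfying

$$\frac{\mu}{2} \sup_{x \in \mathcal{X} \setminus \{0\}} \frac{D\bigl(W(\cdot|x) \,\|\, W(\cdot|0)\bigr)}{c(x)} \le C(\mu, \sigma^2, W) \le \mu \sup_{x \in \mathcal{X} \setminus \{0\}} \frac{D\bigl(W(\cdot|x) \,\|\, W(\cdot|0)\bigr)}{c(x)},$$

where $D(P \| Q)$ is the Kullback-Leibler divergence between distributions $P$ and $Q$ and where $0 \in \mathcal{X}$ is an input symbol with zero cost.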

Theorem 3 approximates the capacity per unit cost of a general DMC with synchronization errors and general cost function to within a factor two. In contrast, recall from Section I that for the capacity, even of the noiseless deletion channel, only loose bounds are known despite over four decades since the introduction of the model in [1].

It was shown in [4] that the capacity per unit cost $C(W)$ of a DMC($W$) is

$$C(W) = \sup_{x \in \mathcal{X} \setminus \{0\}} \frac{D\bigl(W(\cdot|x) \,\|\, W(\cdot|0)\bigr)}{c(x)}.$$

Thus, from the lower and upper bounds in Theorem 3, we obtain the following corollary, showing that the loss due to synchronization errors is within a factor between $\mu/2$ and $\mu$.
Corollary 4. The insertion/deletion/substitution channel consisting of an IDC($\mu, \sigma^2$) concatenated with a DMC($W$) has capacity per unit cost $C(\mu, \sigma^2, W)$ satisfying

$$\frac{\mu}{2} C(W) \le C(\mu, \sigma^2, W) \le \mu\, C(W).$$
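To make the corollary concrete, the sketch below evaluates the divergence-based expression for the capacity per unit cost $C(W)$ of a DMC and the two bounds $\frac{\mu}{2} C(W)$ and $\mu\, C(W)$. The function names, the list-of-rows representation of $W$, and the numerical channel are illustrative assumptions; every nonzero input is assumed to have strictly positive cost, and input 0 is the free symbol.

```python
import math


def kl_divergence_bits(p, q):
    """Kullback-Leibler divergence D(p || q) in bits for finite distributions."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log2(pi / qi)
    return total


def capacity_per_unit_cost(W, cost):
    """Largest D(W(.|x) || W(.|0)) / c(x) over nonzero inputs x; W[x] is the row for input x."""
    return max(kl_divergence_bits(W[x], W[0]) / cost[x] for x in range(1, len(W)))


def sync_error_bounds(W, cost, mu):
    """Lower and upper bounds (mu/2) C(W) and mu C(W) on the capacity per unit cost."""
    c = capacity_per_unit_cost(W, cost)
    return 0.5 * mu * c, mu * c


# Hypothetical binary symmetric channel with crossover 0.1 and cost c(1) = 1.
W_bsc = [[0.9, 0.1], [0.1, 0.9]]
print(capacity_per_unit_cost(W_bsc, [0, 1]))
print(sync_error_bounds(W_bsc, [0, 1], mu=0.9))
```

For a noiseless binary channel the divergence is infinite, so both bounds are infinite as well.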

The achievability in Theorem 3 is established by generalizing the proposed scheme for the Gaussian channel with synchronization errors to DMCs. The details are discussed in Section IV. For the upper bound on the capacity per unit cost in Theorem 3, we treat the effect of the IDC as a specific way of encoding for the DMC with an appropriately modified cost function. The upper bound is then obtained by utilizing the characterization of the capacity per unit cost for memoryless channels in [4]. Section V provides the details.

We conclude this section by demonstrating through an example that the conventional schemes based 
on tight synchronization between the transmitter and the receiver can be highly suboptimal in terms of 
their rate per unit cost. 

Example 3. Let us consider the simplest synchronized communication setting: a channel with binary input and no noise, i.e., $W(x|x) = 1$ for $x \in \{0, 1\}$. Further, let the cost function be the number of ones transmitted, i.e., $c(x) = x$. This is a DMC, and by the characterization of [4] recalled above, its capacity per unit cost is

$$C(W) = \frac{D\bigl(W(\cdot|1) \,\|\, W(\cdot|0)\bigr)}{c(1)} = \infty.$$

Let us now consider the scenario where the transmitter and the receiver are no longer perfectly synchronized. Specifically, the input signals are first corrupted by a deletion channel with deletion probability $d \in (0, 1)$ (see Example 2 in Section II for a formal definition of this special case of an IDC), before being sent over the aforementioned noiseless channel $W$.

Consider first the operation of conventional schemes based on tight synchronization. In this example, we take this to mean any scheme that detects and corrects deletions without letting them accumulate. This definition applies to schemes using pilots as well as to schemes using differential modulation. To maintain tight synchronization, the channel inputs cannot contain too many consecutive zeros (since otherwise deletions would accumulate without any way of correcting for them). On average, we expect to see about one deleted bit every $1/d$ transmitted bits. Thus, roughly one in every $1/d$ channel inputs needs to be a 1, at a cost of $c(1) = 1$. Now, over a block of $1/d$ bits, we can reliably transmit at most $1/d$ bits. Hence, the rate per unit cost achieved by any scheme based on tight synchronization is at most

$$R_{\text{sync}}(d, W) \le \frac{1}{d} < \infty.$$

On the other hand, from Corollary 4, the communication scheme proposed in this paper achieves a rate per unit cost that is within a factor of $\mu/2 = (1 - d)/2$ of the capacity per unit cost $C(W)$ of the underlying DMC($W$). Hence, the capacity per unit cost with synchronization errors is

$$C(d, W) \ge \frac{1 - d}{2} C(W) = \infty.$$

Thus, even in the presence of synchronization errors, the rate per unit cost achieved by the proposed scheme is arbitrarily large. This illustrates that the improvement in the rate per unit cost achieved by the proposed scheme over schemes based on tight synchronization can be unbounded.

IV. Proof of Lower Bound in Theorem 3

In this section, we propose a coding scheme that achieves the lower bound on the rate per unit cost stated in Theorem 3 for a general DMC with synchronization errors. The scheme uses a type of pulse-position modulation. To send message $m$, the encoder sends a burst of symbols $x^* \in \mathcal{X} \setminus \{0\}$ at a position corresponding to this message. The decoder searches for the location of the pulse using a sliding window. Once the pulse is located, the decoder checks which decision region it is in and declares the corresponding message. In order to deal with insertions and deletions, guard spaces need to be introduced around the pulses and the decision regions need to be chosen judiciously. We proceed with a detailed description of the scheme and its analysis.

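Encoding: Fix a target error probability $\varepsilon \in (0, 1)$ and a number $\delta \in (0, 1)$. Let $x^* \in \mathcal{X} \setminus \{0\}$ be a fixed nonzero channel input. The codeword for message $m \in \{1, 2, \dots, M\}$ is

$$(x_m[1], \dots, x_m[T]) \triangleq \bigl(0_{N(m-1)},\ x^* \cdot 1_B,\ 0_{N-B},\ 0_{N(M-m)}\bigr),$$

where $0_k$ and $1_k$ denote the all-zeros and all-ones vectors of length $k$, where

$$N \triangleq \lceil 36 M \sigma^2 / (\mu^2 \varepsilon) \rceil = \Theta(M),$$

and

$$B \triangleq \frac{(2 + \delta) \log(M)}{\mu\, D\bigl(W(\cdot|x^*) \,\|\, W(\cdot|0)\bigr)} = \Theta(\log(M)).$$

Thus, to communicate message $m$, the transmitter sends a pulse of symbols $x^*$ at position $N(m - 1) + 1$ and of duration $B$. Observe that between adjacent possible pulse positions is a guard space of $N - B$ zeros. The block length of this code is $T = MN$ and the cost of each codeword is

$$P \triangleq B\, c(x^*). \tag{2}$$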
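The encoder described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the divergence $D(W(\cdot|x^*) \| W(\cdot|0))$ is passed in as a number in bits, and $B$ is rounded up to an integer so that it can be used as a pulse length (the rounding is an assumption of this sketch).

```python
import math


def scheme_parameters(M, mu, sigma2, eps, delta, divergence_bits):
    """Slot length N and pulse duration B of the pulse-position scheme."""
    N = math.ceil(36 * M * sigma2 / (mu ** 2 * eps))
    # Rounding B up is an assumption made here to obtain an integer pulse length.
    B = math.ceil((2 + delta) * math.log2(M) / (mu * divergence_bits))
    return N, B


def codeword(m, M, N, B, x_star=1):
    """Codeword for message m: a pulse of B symbols x_star starting at position N(m-1)+1."""
    if not 1 <= m <= M:
        raise ValueError("message index out of range")
    if B > N:
        raise ValueError("pulse duration must not exceed the slot length")
    return [0] * (N * (m - 1)) + [x_star] * B + [0] * (N - B) + [0] * (N * (M - m))


N, B = scheme_parameters(M=4, mu=1.0, sigma2=0.01, eps=0.1, delta=0.5, divergence_bits=2.0)
cw = codeword(2, 4, N, B)
print(N, B, len(cw))  # block length is T = M * N
```

With the cost function counting nonzero symbols, the cost of every codeword equals the pulse duration, matching $P = B\,c(x^*)$.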

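Decoding: Recall that the output $\mathsf{y}$ of the channel has length $\mathsf{L}$. The decoder forms the subsequences

$$\mathsf{y}_\ell \triangleq \bigl(\mathsf{y}[\ell], \mathsf{y}[\ell + 1], \dots, \mathsf{y}[\ell + \lfloor B\mu - \beta \rfloor - 1]\bigr)$$

of length $\lfloor B\mu - \beta \rfloor$ for $\ell \in \{1, 2, \dots, \mathsf{L} - \lfloor B\mu - \beta \rfloor + 1\}$ with

$$\beta \triangleq \sqrt{4 B \sigma^2 / \varepsilon} = \Theta(\log^{1/2}(M)).$$

Similarly, define the subsequences

$$\mathsf{x}_\ell \triangleq \bigl(\mathsf{x}[\ell], \mathsf{x}[\ell + 1], \dots, \mathsf{x}[\ell + \lfloor B\mu - \beta \rfloor - 1]\bigr)$$

of the output of the IDC / input to the DMC (not observable at the receiver). Finally, define the decision regions

$$\mathcal{D}_m \triangleq \begin{cases} \{1\}, & \text{for } m = 1, \\ \mathbb{N} \cap \bigl((m - 1)N\mu + 1 - \nu,\ (m - 1)N\mu + 1 + \nu\bigr), & \text{for } m \in \{2, \dots, M\}, \end{cases}$$

with

$$\nu \triangleq \sqrt{4 M N \sigma^2 / \varepsilon} = \Theta(M).$$

In words, the decision region $\mathcal{D}_m$ for message $m$ consists of all integer points within distance $\nu$ of $(m - 1)N\mu + 1$.

The receiver performs independent hypothesis tests for each $\mathsf{y}_\ell$ for the two hypotheses

$$H^0 \triangleq \{\mathsf{x}_\ell = 0\}, \qquad H^1 \triangleq \{\mathsf{x}_\ell = x^* \cdot 1\}.$$

Let $\hat{H}_\ell$ be the decision of the hypothesis test for $\mathsf{y}_\ell$. The receiver declares that message $m$ was sent if $\hat{H}_\ell = H^1$ for some $\ell \in \mathcal{D}_m$ and $\hat{H}_\ell = H^0$ for all $\ell \in \mathcal{D}_{m'}$ with $m' \neq m$. If no such $m$ exists, an error is declared.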
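The decoding rule can be sketched as a sliding-window search restricted to the decision regions. In the sketch below, a simple count threshold on the number of received symbols equal to $x^*$ stands in for the per-window hypothesis test; this stand-in, the function names, and the numerical parameters are assumptions for illustration, and a likelihood-ratio test tuned to $W$ would take its place for a noisy channel.

```python
import math


def decision_region(m, N, mu, nu):
    """Integer points l >= 1 strictly within distance nu of (m - 1) * N * mu + 1."""
    if m == 1:
        return [1]
    center = (m - 1) * N * mu + 1
    lo = math.floor(center - nu) + 1  # smallest integer strictly above center - nu
    hi = math.ceil(center + nu) - 1   # largest integer strictly below center + nu
    return list(range(max(lo, 1), hi + 1))


def decode(y, M, N, B, mu, nu, beta, x_star=1, threshold=None):
    """Return the decoded message in 1..M, or None if an error is declared."""
    w = math.floor(B * mu - beta)     # window length
    if threshold is None:
        threshold = w                 # noiseless default: the whole window must equal x_star

    def detects_pulse(l):
        window = y[l - 1:l - 1 + w]   # positions l, ..., l + w - 1 (1-indexed)
        if len(window) < w:
            return False              # window runs past the end of y
        return sum(1 for v in window if v == x_star) >= threshold

    hits = [m for m in range(1, M + 1)
            if any(detects_pulse(l) for l in decision_region(m, N, mu, nu))]
    # Exactly one region may report a pulse; otherwise an error is declared.
    return hits[0] if len(hits) == 1 else None


y = [0] * 40 + [1] * 6 + [0] * 34     # message 3 with N = 20, B = 6 and no channel errors
print(decode(y, M=4, N=20, B=6, mu=1.0, nu=3.0, beta=1.0))  # 3
```

Requiring exactly one region to report a pulse mirrors the rule that some window in $\mathcal{D}_m$ must decide $H^1$ while every window in the other regions decides $H^0$.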



In order for the decoder to be well defined, we need to ensure that the decision regions are disjoint, i.e., that $\mathcal{D}_m \cap \mathcal{D}_{m'} = \emptyset$ for $m \neq m'$. This is the case since, by the definitions of $N$ and $\nu$,

$$N\mu \ge \sqrt{N} \sqrt{36 M \sigma^2 / \varepsilon} = 3\nu, \tag{3}$$

so that

$$m N \mu + 1 - \nu > (m - 1) N \mu + 1 + \nu$$

for all $m$.
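The separation of adjacent decision regions can be checked numerically for given parameters. The sketch below recomputes $N = \lceil 36 M \sigma^2 / (\mu^2 \varepsilon) \rceil$ and $\nu = \sqrt{4 M N \sigma^2 / \varepsilon}$ and verifies the two displayed inequalities; the function names and the parameter values are hypothetical.

```python
import math


def guard_parameters(M, mu, sigma2, eps):
    """Slot length N and half-width nu of the decision regions."""
    N = math.ceil(36 * M * sigma2 / (mu ** 2 * eps))
    nu = math.sqrt(4 * M * N * sigma2 / eps)
    return N, nu


def regions_are_separated(M, N, mu, nu):
    """Check m*N*mu + 1 - nu > (m-1)*N*mu + 1 + nu for m = 1, ..., M-1."""
    return all(m * N * mu + 1 - nu > (m - 1) * N * mu + 1 + nu for m in range(1, M))


M, mu, sigma2, eps = 8, 1.0, 0.01, 0.1  # illustrative values only
N, nu = guard_parameters(M, mu, sigma2, eps)
print(N * mu >= 3 * nu)                  # inequality (3)
print(regions_are_separated(M, N, mu, nu))
```

Since the slot spacing $N\mu$ is at least $3\nu$ while each region has half-width $\nu$, a gap of at least $\nu$ remains between neighbouring regions.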

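Error Analysis: Assume that message $m$ was sent. Let $\mathcal{E}_{1,\ell}$ be the event that $\hat{H}_\ell = H^0$ and $\mathcal{E}_{2,\ell}$ be the event that $\hat{H}_\ell = H^1$. Define the missed-detection event

$$\mathcal{E}_1 \triangleq \bigcap_{\ell \in \mathcal{D}_m} \mathcal{E}_{1,\ell}$$

and the false-alarm event

$$\mathcal{E}_2 \triangleq \bigcup_{m' \neq m} \bigcup_{\ell \in \mathcal{D}_{m'}} \mathcal{E}_{2,\ell}.$$

The probability of decoding error for message $m$ is equal to $\mathbb{P}_m(\mathcal{E}_1 \cup \mathcal{E}_2)$, where $\mathbb{P}_m$ denotes probability conditioned on message $m$ being sent.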

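We continue by upper bounding this probability. It will be convenient to define two auxiliary events isolating the behavior of the IDC. Let $\mathcal{E}_3$ be the event that the total number of symbols in $\mathsf{x}$ resulting from the first $(m - 1)N$ transmitted symbols is outside the interval $((m - 1)N\mu - \nu,\ (m - 1)N\mu + \nu)$, and let $\mathcal{E}_4$ be the event that the number of symbols in $\mathsf{x}$ resulting from symbols transmitted during time slots $(m - 1)N + 1$ to $(m - 1)N + B$ is outside the interval $(B\mu - \beta,\ B\mu + \beta)$. We have

$$\begin{aligned}
\mathbb{P}_m(\mathcal{E}_1 \cup \mathcal{E}_2) &= \mathbb{P}_m(\mathcal{E}_1 \cup \mathcal{E}_2 \mid \mathcal{E}_3 \cup \mathcal{E}_4)\, \mathbb{P}_m(\mathcal{E}_3 \cup \mathcal{E}_4) + \mathbb{P}_m(\mathcal{E}_1 \cup \mathcal{E}_2 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c)\, \mathbb{P}_m(\mathcal{E}_3^c \cap \mathcal{E}_4^c) \\
&\le \mathbb{P}_m(\mathcal{E}_3 \cup \mathcal{E}_4) + \mathbb{P}_m(\mathcal{E}_1 \cup \mathcal{E}_2 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c) \\
&\le \mathbb{P}_m(\mathcal{E}_3) + \mathbb{P}_m(\mathcal{E}_4) + \mathbb{P}_m(\mathcal{E}_1 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c) + \mathbb{P}_m(\mathcal{E}_2 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c).
\end{aligned} \tag{4}$$

The first two probabilities correspond to the events that the IDC($\mu, \sigma^2$) is not well behaved, and the last two correspond to the two possible detection errors caused by the DMC($W$), conditioned on the IDC behaving as expected.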

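We continue by upper bounding each of the terms in (4) in turn. By Chebyshev's inequality,

$$\mathbb{P}_m(\mathcal{E}_3) = \mathbb{P}\Bigl(\Bigl|\sum_{t=1}^{(m-1)N} \mathsf{s}[t] - (m - 1)N\mu\Bigr| \ge \nu\Bigr) \le \frac{(m - 1)N\sigma^2}{\nu^2} \le \varepsilon/4, \tag{5}$$

and

$$\mathbb{P}_m(\mathcal{E}_4) = \mathbb{P}\Bigl(\Bigl|\sum_{t=(m-1)N+1}^{(m-1)N+B} \mathsf{s}[t] - B\mu\Bigr| \ge \beta\Bigr) \le \frac{B\sigma^2}{\beta^2} = \varepsilon/4, \tag{6}$$

where we have used the definitions of $\nu$ and $\beta$, respectively.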

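We proceed with the analysis of $\mathbb{P}_m(\mathcal{E}_1 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c)$ and $\mathbb{P}_m(\mathcal{E}_2 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c)$. The following is the key observation for this analysis. Conditioned on message $m$ being sent and on $\mathcal{E}_3^c \cap \mathcal{E}_4^c$, the elements in the decision regions satisfy the following two properties for $M$ large enough (not depending on $m$):

1) For every $\ell \in \mathcal{D}_{m'}$ with $m' \neq m$, we have $\mathsf{x}_\ell = 0$. Hence, the symbols in the subsequence $\mathsf{y}_\ell$ are i.i.d. with distribution $W(\cdot|0)$.

2) There exists at least one $\ell \in \mathcal{D}_m$ such that $\mathsf{x}_\ell = x^* \cdot 1$. Hence, the symbols in the subsequence $\mathsf{y}_\ell$ are i.i.d. with distribution $W(\cdot|x^*)$.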

We start by proving property 1. By construction of the codewords, and since the IDC part of the channel can only delete and duplicate symbols but never "create" them, we only need to argue that the burst of symbol $x^*$ sent in block $m$ by the transmitter cannot be shifted into the decision region $\mathcal{D}_{m'}$.

Assume first $m' < m$. The right-most element of $\mathcal{D}_{m'}$ is at position less than or equal to $(m - 2)N\mu + 1 + \nu$, and therefore the right-most element of $\mathsf{x}_\ell$ with $\ell \in \mathcal{D}_{m'}$ is at position at most

$$(m - 2)N\mu + \nu + B\mu.$$

Now, conditioned on $\mathcal{E}_3^c$, there are at least $(m - 1)N\mu - \nu$ symbols in $\mathsf{x}$ before the first symbol $x^*$. For there to be no overlap, it is sufficient to argue that

$$(m - 2)N\mu + \nu + B\mu \le (m - 1)N\mu - \nu,$$

or, equivalently, that

$$N\mu \ge 2\nu + B\mu.$$

This holds for $M$ large enough since we have $N\mu \ge 3\nu$ by (3), and since $\nu = \Theta(M)$ whereas $B = \Theta(\log(M))$.

Assume then that $m' > m$. The left-most element of any $\mathsf{x}_\ell$ with $\ell \in \mathcal{D}_{m'}$ is at position at least $mN\mu + 1 - \nu$. Conditioned on $\mathcal{E}_3^c$, there are at most $(m - 1)N\mu + \nu$ symbols before the first symbol $x^*$ in $\mathsf{x}$. Conditioned on $\mathcal{E}_4^c$, the burst of symbol $x^*$ in $\mathsf{x}$ is of length at most $B\mu + \beta$. For there to be no overlap, it is sufficient to argue that

$$(m - 1)N\mu + \nu + B\mu + \beta < mN\mu + 1 - \nu,$$

which holds whenever

$$N\mu \ge 2\nu + B\mu + \beta.$$

This is the case for $M$ large enough by the same argument as in the last paragraph, since $\beta = \Theta(\log^{1/2}(M))$. Together, this proves property 1.

To prove property 2, observe that, conditioned on $\mathcal{E}_4^c$, the burst of symbols $x^*$ in $\mathsf{x}$ is of length at least $B\mu - \beta$. Further, conditioned on $\mathcal{E}_3^c$, this burst must start at the receiver in the interval

$$\mathbb{N} \cap \bigl((m - 1)N\mu + 1 - \nu,\ (m - 1)N\mu + 1 + \nu\bigr) = \mathcal{D}_m.$$

Since the subsequences $\mathsf{x}_\ell$ have length $\lfloor B\mu - \beta \rfloor$, these two statements show that there exists at least one $\ell \in \mathcal{D}_m$ such that $\mathsf{x}_\ell = x^* \cdot 1$.

The two properties allow us to analyze the events $\mathcal{E}_{1,\ell}$ and $\mathcal{E}_{2,\ell}$. Recall that the hypothesis test on $\mathsf{y}_\ell$ is performed under the assumption that either $\mathsf{x}_\ell = 0$ or $\mathsf{x}_\ell = x^* \cdot 1$. Fix a threshold for the hypothesis test of $\mathsf{y}_\ell$ such that the probability of missed detection satisfies

$$\mathbb{P}\bigl(\hat{H}_\ell = H^0 \mid \mathsf{x}_\ell = x^* \cdot 1\bigr) \le \varepsilon/4. \tag{7}$$

By Stein's lemma (see, e.g., [15, Theorem 12.8.1]), we then have that the probability of false alarm of the optimal test is upper bounded by

$$\begin{aligned}
\mathbb{P}\bigl(\hat{H}_\ell = H^1 \mid \mathsf{x}_\ell = 0\bigr) &\le 2^{-\lfloor B\mu - \beta \rfloor D(W(\cdot|x^*) \| W(\cdot|0)) + o(B\mu - \beta)} \\
&\le 2^{-B\mu D(W(\cdot|x^*) \| W(\cdot|0)) + o(\log(M))}
\end{aligned} \tag{8}$$

as $M \to \infty$, where we have used that $B = \Theta(\log(M))$ and $\beta = \Theta(\log^{1/2}(M))$.

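Consider then the value of $\ell \in \mathcal{D}_m$ guaranteed by property 2. For this $\ell$, we have by (7),

$$\mathbb{P}_m(\mathcal{E}_1 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c) \le \mathbb{P}\bigl(\hat{H}_\ell = H^0 \mid \mathsf{x}_\ell = x^* \cdot 1\bigr) \le \varepsilon/4. \tag{9}$$

By property 1 and (8), and using that $|\mathcal{D}_1| \le |\mathcal{D}_2| = |\mathcal{D}_3| = |\mathcal{D}_4| = \dots$, the union bound over all windows in the other decision regions yields

$$\mathbb{P}_m(\mathcal{E}_2 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c) \le M |\mathcal{D}_2|\, 2^{-B\mu D(W(\cdot|x^*) \| W(\cdot|0)) + o(\log(M))}.$$

Now,

$$|\mathcal{D}_2| \le 2\nu + 1 = O(M)$$

as $M \to \infty$. Hence, using the definition of $B$,

$$\begin{aligned}
\mathbb{P}_m(\mathcal{E}_2 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c) &\le 2^{2\log(M) - B\mu D(W(\cdot|x^*) \| W(\cdot|0)) + o(\log(M))} \\
&= 2^{-\delta \log(M) + o(\log(M))} \\
&\le \varepsilon/4
\end{aligned} \tag{10}$$

for $M$ large enough.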

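Substituting (5), (6), (9), and (10) into (4) shows that for $M$ large enough the probability of decoding error is upper bounded by $\varepsilon$ for every message $m$. By (2), the achievable rate per unit cost of this scheme is

$$R \ge \frac{\log(M)}{P} = \frac{\log(M)}{B\, c(x^*)} = \frac{\mu}{2 + \delta} \cdot \frac{D\bigl(W(\cdot|x^*) \,\|\, W(\cdot|0)\bigr)}{c(x^*)}.$$

Since $\delta > 0$ can be made arbitrarily small, this shows that

$$C(\mu, \sigma^2, W) \ge \frac{\mu}{2} \cdot \frac{D\bigl(W(\cdot|x^*) \,\|\, W(\cdot|0)\bigr)}{c(x^*)}.$$

Taking the supremum over all $x^* \in \mathcal{X} \setminus \{0\}$ completes the proof of the lower bound in Theorem 3. ■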

Remark: In the presence of perfect synchronization, the decoder knows exactly where the possible pulses are located, and thus needs to check only $M$ possible pulse positions. However, in the proposed scheme, we need to check $\Theta(M^2)$ possible pulse positions due to the lack of synchronization, since each of the $M$ decision regions contains $\Theta(M)$ candidate positions. It is this increase from $M$ to $\Theta(M^2)$ hypothesis tests that results in the reduction of rate per unit cost by a factor 2 compared to the synchronized case.
Fig. 3. The behavior of the original insertion/deletion/substitution channel model (top figure) can be simulated over the discrete memoryless channel $W$ by modifying the encoder and the cost function (bottom figure).

V. Proof of Upper Bound in Theorem 3

In this section, we provide an upper bound on the capacity per unit cost of the insertion/deletion/ 
substitution channel. Since the IDC part of the channel is not memoryless, standard converse techniques 
are not applicable in this setting. Instead, we use a simulation argument, namely that the insertion/ 
deletion/substitution channel can be simulated with an encoder and decoder communicating over a discrete 
memoryless channel. This DMC, in turn, can be analyzed and yields the desired upper bound for capacity 
per unit cost of the insertion/deletion/substitution channel. We now provide the details of this argument. 

Let $f$ and $\{\varphi_\ell\}_{\ell=0}^{\infty}$ be an encoder-decoder pair achieving rate per unit cost
$C(\mu,\sigma^2,W) - \delta$ and average probability of error at most $\varepsilon$ for the
insertion/deletion/substitution channel. Note that, since the output of the channel is of random length $L$,
the decoder consists of several sub-decoders $\{\varphi_\ell\}_{\ell=0}^{\infty}$, one for each
possible realization $\ell$ of $L$.

We want to argue that the statistical behavior between the input $m$ to the encoder $f$ and the output $\hat{m}$
of the decoder $\varphi_L$ can be simulated over the discrete memoryless channel $W$ alone (see Fig. 3). Consider
a new encoder $f'$ that consists of the concatenation of $f$ with an IDC of the same statistical behavior as
the one in the original insertion/deletion/substitution channel. Denote by $s'[t]$ the state random variables
describing this IDC. Observe that the encoder $f'$ is a randomized, variable-length encoder, mapping the
message $m$ into a random sequence $x'^{L'}$ of random length $L'$.

The output $x'^{L'}$ of the encoder $f'$ is transmitted over a DMC with the same transition probability
matrix $W$ as in the original insertion/deletion/substitution channel. Let $y'^{L'}$ be the output of this DMC.
The decoder for the DMC is equal to $\{\varphi_\ell\}_{\ell=0}^{\infty}$. Observe that this is a variable-length
decoder, and denote by $\hat{m}'$ its output.

By construction, for the same message $m$, the distributions of $\hat{m}$ and $\hat{m}'$ are identical. In particular, the
average probability of error of both systems is the same. Hence, the average probability of error of $f'$
and $\{\varphi_\ell\}_{\ell=0}^{\infty}$ is at most $\varepsilon$.

Define the new cost function

\[
c'(\cdot) \triangleq \frac{1}{\mu}\, c(\cdot)
\]

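for the simulating DMC. With respect to this cost function, and assuming a uniformly distributed message
$m$, the expected cost of using the variable-length encoder $f'$ over $W$ is

\[
\mathbb{E}\, c'(x'^{L'})
\overset{(a)}{=} \frac{1}{\mu}\, \mathbb{E}\Big( \sum_{t=1}^{T} s'[t]\, c(x'[t]) \Big)
\overset{(b)}{=} \frac{1}{\mu}\, \mathbb{E}(s'[1]) \sum_{t=1}^{T} \mathbb{E}\, c(x'[t])
\overset{(c)}{=} \frac{1}{\mu}\, \mathbb{E}(s[1]) \sum_{t=1}^{T} \mathbb{E}\, c(x[t])
= \mathbb{E}\, c(x^T),
\]

where the last equality uses $\mathbb{E}(s[1]) = \mu$.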

Here, (a) follows from the definition of c'(-); (b) follows from the fact that the insertion/deletion process 
{s'[t]} is identically distributed and independent of the channel inputs; and (c) follows since s'[t] and s[t] 
have the same distribution, and since x'[t] and x[t] have the same distribution. 
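The cost identity above can be checked exactly on a toy example. The following is a minimal sketch, assuming i.i.d. states with a small hypothetical distribution and a quadratic cost; the function name `expected_scaled_cost` and the example values are illustrative and not part of the paper.

```python
from fractions import Fraction
from itertools import product


def expected_scaled_cost(codeword, state_pmf, cost):
    """Exact value of E[(1/mu) * sum_t s[t] * c(x[t])] for i.i.d. states.

    state_pmf maps each possible state value (number of copies of an input
    symbol produced by the IDC) to its probability. The states are assumed
    independent of the codeword, as in step (b) of the derivation.
    """
    mu = sum(s * p for s, p in state_pmf.items())  # mean of one state
    total = Fraction(0)
    # Enumerate every realization of the state sequence.
    for states in product(state_pmf, repeat=len(codeword)):
        prob = Fraction(1)
        for s in states:
            prob *= state_pmf[s]
        scaled = sum(s * cost(x) for s, x in zip(states, codeword)) / mu
        total += prob * scaled
    return total


# Hypothetical state distribution with mean 5/4 and a short codeword.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)}
word = (0, 3, 0, -2)
print(expected_scaled_cost(word, pmf, lambda x: x * x))
```

With the scaled cost function, the expected cost over the simulating DMC coincides with the plain cost of the codeword, which is what the chain of equalities asserts.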

Hence, encoder $f'$ used over the DMC $W$ has the same expected cost with respect to the cost function
$c'(\cdot)$ as the encoder $f$ used over the insertion/deletion/substitution channel with respect to the cost function
$c(\cdot)$. Observe that, while the two encoders have the same expected cost, the encoder $f$ satisfies the stronger
per-codeword cost constraint, whereas the encoder $f'$ does not.

The arguments in the last two paragraphs show that there exists a randomized variable-length encoder
$f'$ and variable-length decoder $\{\varphi_\ell\}_{\ell=0}^{\infty}$ achieving a rate per average unit cost of

\[
\hat{R}' \triangleq \frac{\log(M)}{\mathbb{E}\, c'(x'^{L'})} = \frac{\log(M)}{\mathbb{E}\, c(x^T)}
\]

and average probability of error at most $\varepsilon$. Since this is just one possible coding scheme, as $M \to \infty$, $\hat{R}'$
must be upper bounded by $C'(W)$, the capacity per unit cost of the DMC $W$ subject to expected cost
constraint, and allowing randomized variable-length codes. Thus, letting $\delta \to 0$,

\[
C(\mu,\sigma^2,W) \leq C'(W). \qquad (11)
\]

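It remains to analyze $C'(W)$. By [16, Exercise 6.28], we have^1

\[
C'(W) = \max_{x'} \frac{I(x'; y)}{\mathbb{E}\, c'(x')}.
\]

Furthermore, by [3, Theorems 2 and 3],

\[
\max_{x'} \frac{I(x'; y)}{\mathbb{E}\, c'(x')}
= \sup_{x' \in \mathcal{X} \setminus \{0\}} \frac{D(W(\cdot|x')\,\|\,W(\cdot|0))}{c'(x')}
= \mu \sup_{x \in \mathcal{X} \setminus \{0\}} \frac{D(W(\cdot|x)\,\|\,W(\cdot|0))}{c(x)},
\]

so that

\[
C'(W) = \mu \sup_{x \in \mathcal{X} \setminus \{0\}} \frac{D(W(\cdot|x)\,\|\,W(\cdot|0))}{c(x)}. \qquad (12)
\]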



Combining (11) and (12) shows that

\[
C(\mu,\sigma^2,W) \leq \mu \sup_{x \in \mathcal{X} \setminus \{0\}} \frac{D(W(\cdot|x)\,\|\,W(\cdot|0))}{c(x)},
\]

completing the proof.
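To make the two bounds of Theorem 3 concrete, the sketch below evaluates the divergence-to-cost ratio for a small discrete channel and returns the lower bound (half of $\mu$ times the supremum) and the upper bound ($\mu$ times the supremum). The channel, cost values, and the helper names `kl_divergence_bits` and `unit_cost_bounds` are hypothetical choices for illustration; logarithms are taken to base 2.

```python
import math


def kl_divergence_bits(p, q):
    """Kullback-Leibler divergence D(p || q) in bits for two finite pmfs."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def unit_cost_bounds(W, cost, mu):
    """Lower and upper bounds on the capacity per unit cost.

    W maps each input symbol to its output pmf; input 0 is the zero-cost
    symbol. cost maps each nonzero input symbol to its (positive) cost.
    """
    best = max(
        kl_divergence_bits(W[x], W[0]) / cost[x] for x in W if x != 0
    )
    return mu * best / 2, mu * best


# Hypothetical example: binary symmetric channel with crossover 0.1,
# unit cost for the nonzero input, and mean IDC state mu = 0.5.
W = {0: [0.9, 0.1], 1: [0.1, 0.9]}
lower, upper = unit_cost_bounds(W, {1: 1.0}, mu=0.5)
print(lower, upper)
```

The two returned values always differ by exactly a factor of two, which is the gap stated in the abstract for general discrete memoryless channels.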



^1 The statement of the exercise is for capacity per unit cost for DMCs under expected cost constraint and with deterministic variable-length
codes. However, the same proof outlined there also holds for randomized encoders.

VI. Proof of Theorem 1

The upper bound in Theorem 1 follows from the upper bound for DMCs. Indeed, by Corollary 4,

\[
C(\mu,\sigma^2,\mathcal{N}(0,\eta^2)) \leq \mu\, C(\mathcal{N}(0,\eta^2)) = \frac{\mu}{2\eta^2 \ln(2)},
\]

yielding the desired upper bound. 

For the lower bound, we adapt the achievable scheme for the DMC described in Section IV to the
Gaussian case. For simplicity, we consider the case of noise power $\eta^2 = 1$ and point out in the end how
to extend the result for arbitrary values of $\eta^2$.

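Encoding: Fix a target error probability $\varepsilon \in (0, 1)$ and a number $\delta \in (0, 1)$. Let $x^* = x^*(M) > 0$ be
a nonzero channel input. Unlike the DMC case, we will choose $x^*(M)$ as a function of $M$ here. The
codeword for message $m \in \{1, 2, \ldots, M\}$ is again given by

\[
x_m[t] \triangleq
\begin{cases}
x^*, & \text{if } t \in \{(m-1)N + 1, \ldots, (m-1)N + B\} \\
0, & \text{otherwise}
\end{cases}
\]

with the same

\[
N \triangleq \lceil 36 M \sigma^2 / (\mu^2 \varepsilon) \rceil = \Theta(M)
\]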
as before. However, here we choose the burst length $B$ as

\[
B \triangleq \lfloor \sqrt{MN}\, \sigma / \mu \rfloor = \Theta(M),
\]

as opposed to $\Theta(\log(M))$ in the DMC case. It can be verified that $B < N$, and hence the codewords are
well defined. The block length of this code is $T = MN$ and the cost of each codeword is

\[
P = B (x^*)^2. \qquad (13)
\]

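Decoding: Consider again the subsequences

\[
y_\ell \triangleq (y[\ell], y[\ell+1], \ldots, y[\ell + \lfloor B\mu - \beta \rfloor - 1])
\]

and

\[
x_\ell \triangleq (x[\ell], x[\ell+1], \ldots, x[\ell + \lfloor B\mu - \beta \rfloor - 1])
\]

of length $\lfloor B\mu - \beta \rfloor$ for $\ell \in \{1, 2, \ldots, L - \lfloor B\mu - \beta \rfloor + 1\}$ with

\[
\beta \triangleq \sqrt{4 B \sigma^2 / \varepsilon} = \Theta(\sqrt{M}).
\]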

Define the decision regions

\[
\mathcal{D}_m \triangleq
\begin{cases}
\{1\}, & \text{for } m = 1 \\
(\lfloor M/\log(M) \rfloor \cdot \mathbb{N}) \cap ((m-1)N\mu + 1 - \nu,\; (m-1)N\mu + 1 + \nu), & \text{for } m \in \{2, \ldots, M\}
\end{cases}
\]

with

\[
\nu \triangleq \sqrt{4 M N \sigma^2 / \varepsilon} = \Theta(M).
\]

In words, the decision regions consist of $\{1\}$ (for $m = 1$) or of all multiples of $\lfloor M/\log(M) \rfloor$ between
$(m-1)N\mu + 1 - \nu$ and $(m-1)N\mu + 1 + \nu$ (for $m > 1$). In the DMC case, the boundaries of the decision
regions are the same, but there the regions contain every integer between them.
Using the same arguments as in the DMC case shows that these decision regions are disjoint. 
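As an illustration of the disjointness claim, the following sketch builds the decision regions numerically. It assumes the parameter choices $N = \lceil 36M\sigma^2/(\mu^2\varepsilon) \rceil$ and $\nu = \sqrt{4MN\sigma^2/\varepsilon}$, base-2 logarithms for the grid spacing $\lfloor M/\log(M) \rfloor$, and hypothetical example values; the function name `gaussian_decision_regions` is not from the paper.

```python
import math


def gaussian_decision_regions(M, mu, sigma2, eps):
    """Return (N, nu, regions) for the known-mu Gaussian scheme.

    regions[m - 1] lists the grid points of the decision region for
    message m. The formulas for N and nu are assumptions stated in the
    lead-in, and log is taken to base 2.
    """
    N = math.ceil(36 * M * sigma2 / (mu ** 2 * eps))
    nu = math.sqrt(4 * M * N * sigma2 / eps)
    step = math.floor(M / math.log2(M))  # spacing of tested positions
    regions = [[1]]  # message 1 is tested at position 1 only
    for m in range(2, M + 1):
        center = (m - 1) * N * mu + 1
        lo, hi = center - nu, center + nu
        first = (math.floor(lo / step) + 1) * step  # smallest multiple > lo
        regions.append(
            [k for k in range(first, math.ceil(hi), step) if lo < k < hi]
        )
    return N, nu, regions


N, nu, regions = gaussian_decision_regions(M=64, mu=0.5, sigma2=0.25, eps=0.1)
# Consecutive regions never interleave because 2 * nu < N * mu.
print(all(a[-1] < b[0] for a, b in zip(regions, regions[1:])))
```

With these choices $\nu$ is about $N\mu/3$, so neighboring open intervals of half-width $\nu$ centered $N\mu$ apart cannot overlap.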
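The receiver independently performs the hypothesis test

\[
\frac{1}{\sqrt{\lfloor B\mu - \beta \rfloor}} \langle y_\ell, \mathbf{1} \rangle
\underset{H_0}{\overset{H_1}{\gtrless}} \sqrt{(2+\delta) \ln(M)}
\]

for each $\ell \in \mathcal{D}_m$, $m \in \{1, \ldots, M\}$, and where $\langle \cdot, \cdot \rangle$ denotes the inner product. Let $\hat{H}_\ell$ be the decision of
the hypothesis test for $y_\ell$. As in the DMC case, the receiver declares that message $m$ was sent if $\hat{H}_\ell = H_1$
for some $\ell \in \mathcal{D}_m$ and $\hat{H}_\ell = H_0$ for all $\ell \in \mathcal{D}_{m'}$ with $m' \neq m$. If no such $m$ exists, an error is declared.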
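The per-window test can be sketched in a few lines. This is a minimal illustration, assuming unit noise power and a window that has already been extracted from the received sequence; the function name `pulse_detected` and the noise-free sanity check are hypothetical and only exercise the threshold rule.

```python
import math


def pulse_detected(window, M, delta):
    """Threshold test on the normalized sum of one received window.

    Returns True (pulse declared) when <window, 1> / sqrt(len(window))
    is at least sqrt((2 + delta) * ln(M)).
    """
    statistic = sum(window) / math.sqrt(len(window))
    return statistic >= math.sqrt((2 + delta) * math.log(M))


# Noise-free sanity check with a hypothetical window length n = 100:
# a pulse of amplitude (1 + delta) * threshold / sqrt(n) clears the test.
n, M, delta = 100, 1000, 0.1
threshold = math.sqrt((2 + delta) * math.log(M))
amplitude = (1 + delta) * threshold / math.sqrt(n)
print(pulse_detected([amplitude] * n, M, delta))
print(pulse_detected([0.0] * n, M, delta))
```

Normalizing by the square root of the window length keeps the noise contribution at unit variance, so the same threshold applies regardless of the burst length.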

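Error Analysis: Assume that message $m$ was sent. We define the same events as in the DMC case. Let
$\mathcal{E}_{1,\ell}$ be the event that $\hat{H}_\ell = H_0$, and $\mathcal{E}_{2,\ell}$ be the event that $\hat{H}_\ell = H_1$. Define the missed-detection event

\[
\mathcal{E}_1 \triangleq \bigcap_{\ell \in \mathcal{D}_m} \mathcal{E}_{1,\ell}
\]

and the false-alarm event

\[
\mathcal{E}_2 \triangleq \bigcup_{m' \neq m} \bigcup_{\ell \in \mathcal{D}_{m'}} \mathcal{E}_{2,\ell}.
\]

The probability of decoding error for message $m$ is then equal to $\mathbb{P}_m(\mathcal{E}_1 \cup \mathcal{E}_2)$, where again $\mathbb{P}_m$ denotes
probability conditioned on message $m$ being sent.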

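We again define the two auxiliary events describing the behavior of the IDC. Let $\mathcal{E}_3$ be the event
that the total number of symbols in $x$ resulting from the first $(m-1)N$ transmitted symbols is outside
$((m-1)N\mu - \nu,\; (m-1)N\mu + \nu)$, and let $\mathcal{E}_4$ be the event that the number of symbols in $x$ resulting from
symbols transmitted during time slots $(m-1)N + 1$ to $(m-1)N + B$ is outside $(B\mu - \beta,\; B\mu + \beta)$.
Using the same argument as for the DMC case, we can upper bound

\[
\mathbb{P}_m(\mathcal{E}_1 \cup \mathcal{E}_2) \leq \mathbb{P}_m(\mathcal{E}_3) + \mathbb{P}_m(\mathcal{E}_4)
+ \mathbb{P}_m(\mathcal{E}_1 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c) + \mathbb{P}_m(\mathcal{E}_2 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c). \qquad (14)
\]

Using Chebyshev's inequality as in the analysis of the DMC case, we obtain

\[
\mathbb{P}_m(\mathcal{E}_3) \leq \varepsilon/4 \qquad (15)
\]

and

\[
\mathbb{P}_m(\mathcal{E}_4) \leq \varepsilon/4. \qquad (16)
\]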

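We proceed with the analysis of $\mathbb{P}_m(\mathcal{E}_1 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c)$ and $\mathbb{P}_m(\mathcal{E}_2 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c)$. Conditioned on message $m$ being
sent and on $\mathcal{E}_3^c \cap \mathcal{E}_4^c$, the elements in the decision regions satisfy the following two properties for $M$ large
enough (not depending on $m$):

1) For every $\ell \in \mathcal{D}_{m'}$ with $m' \neq m$, we have $\langle x_\ell, \mathbf{1} \rangle = 0$. Hence,

\[
\frac{1}{\sqrt{\lfloor B\mu - \beta \rfloor}} \langle y_\ell, \mathbf{1} \rangle
\]

is Gaussian with mean zero and variance one.

2) There exists at least one $\ell \in \mathcal{D}_m$ such that

\[
\langle x_\ell, \mathbf{1} \rangle \geq x^* (\lfloor B\mu - \beta \rfloor - M/\log(M)).
\]

Hence,

\[
\frac{1}{\sqrt{\lfloor B\mu - \beta \rfloor}} \langle y_\ell, \mathbf{1} \rangle
\]

is Gaussian with mean at least

\[
x^* \sqrt{\lfloor B\mu - \beta \rfloor} \Big( 1 - \frac{M}{\lfloor B\mu - \beta \rfloor \log(M)} \Big)
\]

and variance one.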

The first property follows by the same arguments as for the DMC case, using that $B < \nu/(2\mu)$ and
$\beta = o(B(M))$ as $M \to \infty$. For the second property, note that by the arguments for the DMC case, there
exists at least one $\ell' \in \{1\}$ for $m = 1$ or $\ell' \in \mathbb{N} \cap ((m-1)N\mu + 1 - \nu,\; (m-1)N\mu + 1 + \nu)$ for $m > 1$
such that $x_{\ell'} = x^* \cdot \mathbf{1}$. However, this value of $\ell'$ may not be a multiple of $\lfloor M/\log(M) \rfloor$, and hence may
not be an element of $\mathcal{D}_m$. Let $\ell$ be the closest multiple of $\lfloor M/\log(M) \rfloor$ to $\ell'$ that is in $\mathcal{D}_m$; such an $\ell$
exists for $M$ large enough since $\nu = \Theta(M)$. Since $|\ell - \ell'| \leq M/\log(M)$, this implies that

\[
\langle x_\ell, \mathbf{1} \rangle \geq \langle x_{\ell'}, \mathbf{1} \rangle - x^* M/\log(M) = x^* (\lfloor B\mu - \beta \rfloor - M/\log(M)),
\]

as required.

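The two properties allow us to analyze the events $\mathcal{E}_1$ and $\mathcal{E}_2$. By property 1, and using that $|\mathcal{D}_1| \leq |\mathcal{D}_2|$,

\[
\mathbb{P}_m(\mathcal{E}_2 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c)
\leq \sum_{m' \neq m} \sum_{\ell \in \mathcal{D}_{m'}} \mathbb{P}_m(\mathcal{E}_{2,\ell} \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c)
\leq M |\mathcal{D}_2|\, Q\big(\sqrt{(2+\delta) \ln(M)}\big).
\]

Using the Chernoff bound $Q(a) \leq \exp(-a^2/2)$ for the Q-function, we have

\[
Q\big(\sqrt{(2+\delta) \ln(M)}\big) \leq M^{-(1+\delta/2)}.
\]

Moreover, since $\nu = \Theta(M)$,

\[
|\mathcal{D}_2| \leq \frac{2\nu}{\lfloor M/\log(M) \rfloor} + 1 \leq O(\log(M))
\]

as $M \to \infty$. Hence,

\[
\mathbb{P}_m(\mathcal{E}_2 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c) \leq O(M^{-\delta/2} \log(M)) \leq \varepsilon/4 \qquad (17)
\]

for $M$ large enough.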
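The Chernoff step can be checked numerically. The sketch below, with hypothetical helper names `q_function` and `false_alarm_terms`, evaluates the Gaussian tail at the threshold $\sqrt{(2+\delta)\ln(M)}$ and compares it with $M^{-(1+\delta/2)}$; it uses the standard identity $Q(a) = \tfrac{1}{2}\,\mathrm{erfc}(a/\sqrt{2})$.

```python
import math


def q_function(a):
    """Gaussian tail probability Q(a) = P(Z > a) for standard normal Z."""
    return 0.5 * math.erfc(a / math.sqrt(2))


def false_alarm_terms(M, delta):
    """Return (exact tail at the threshold, Chernoff bound M^-(1+delta/2))."""
    threshold = math.sqrt((2 + delta) * math.log(M))
    return q_function(threshold), M ** -(1 + delta / 2)


for M in (10, 1000, 10 ** 6):
    exact, bound = false_alarm_terms(M, delta=0.1)
    print(M, exact <= bound)
```

Multiplying the per-test bound by the roughly $M \log(M)$ tests performed outside the correct region leaves a total of order $M^{-\delta/2}\log(M)$, which vanishes as $M$ grows.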

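Consider then the value of $\ell \in \mathcal{D}_m$ guaranteed by property 2. For this $\ell$,

\[
\mathbb{P}_m(\mathcal{E}_1 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c)
\leq \mathbb{P}_m(\mathcal{E}_{1,\ell} \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c)
\leq Q\Big( x^* \sqrt{\lfloor B\mu - \beta \rfloor} \Big( 1 - \frac{M}{\lfloor B\mu - \beta \rfloor \log(M)} \Big) - \sqrt{(2+\delta) \ln(M)} \Big).
\]

Recall that $B = \Theta(M)$ and $\beta = \Theta(\sqrt{M})$, so that

\[
\sqrt{\lfloor B\mu - \beta \rfloor} \Big( 1 - \frac{M}{\lfloor B\mu - \beta \rfloor \log(M)} \Big) = \sqrt{B\mu}\,(1 - o(1))
\]

as $M \to \infty$. By choosing

\[
x^* \triangleq (1+\delta) \sqrt{(2+\delta) \ln(M) / (B\mu)}, \qquad (18)
\]

we obtain

\[
\mathbb{P}_m(\mathcal{E}_1 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c) \leq Q\big( (\delta - o(1)) \sqrt{(2+\delta) \ln(M)} \big) \leq \varepsilon/4 \qquad (19)
\]

for $M$ large enough.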

Substituting (15), (16), (17), and (19) into (14) shows that for $M$ large enough the probability of
decoding error is upper bounded by $\varepsilon$ for every message $m$. By (13) and (18), the power required by
this scheme is

\[
P = B (x^*)^2 = (1+\delta)^2 (2+\delta) \ln(M) / \mu.
\]
[Figure 4: three aligned time axes. Input: marks at 1, B_1, N_2, N_2 + B_2. Output (μ_1): marks at 1, μ_1 B_1, μ_1 N_2, μ_1 (N_2 + B_2). Output (μ_2): marks at 1, μ_2 B_1, μ_2 N_2, μ_2 (N_2 + B_2).]

Fig. 4. Construction of input waveforms for the Gaussian compound insertion/deletion/substitution channel. In the figure, μ_1 = 1/2 and
μ_2 = 2. For simplicity, δ is set to 0. N_2 is chosen such that μ_1 N_2 ≈ μ_2 B_1, ensuring that pulses corresponding to different messages
(indicated by the dotted lines) are nonoverlapping at the receiver for all values of μ ∈ [μ_1, μ_2] and under expected behavior of the IDC.

Hence, the achievable rate per unit cost for this scheme is

\[
\hat{R} \triangleq \frac{\log(M)}{P} \geq \frac{\mu}{(1+\delta)^2 (2+\delta) \ln(2)}.
\]

Since $\delta > 0$ can be made arbitrarily small, this shows that, for noise power $\eta^2 = 1$,

\[
C(\mu,\sigma^2,\mathcal{N}(0,1)) \geq \frac{\mu}{2 \ln(2)}.
\]

Assume then that $\eta^2 \neq 1$. By scaling the channel input at the transmitter by a factor $\eta$ and the channel
output at the receiver by a factor $1/\eta$, we can transform the channel to one with unit noise power. Since
this increases the energy of the transmitted symbols by a factor $\eta^2$, but does not change the probability
of error, this shows that

\[
C(\mu,\sigma^2,\mathcal{N}(0,\eta^2)) \geq \frac{\mu}{2\eta^2 \ln(2)},
\]

concluding the proof. ■
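The closing computation can be reproduced numerically. The sketch below evaluates the achieved rate per unit energy for a given slack $\delta$ and compares it with the upper bound $\mu/(2\eta^2\ln(2))$; the function names and the example values of $\mu$ and $\eta^2$ are hypothetical, and rates are in bits per unit energy.

```python
import math


def achieved_rate_per_unit_energy(mu, delta, eta2=1.0):
    """Rate mu / ((1 + delta)^2 (2 + delta) eta^2 ln 2) of the scheme.

    For eta2 != 1 the input is scaled by eta, which multiplies the
    transmitted energy by eta^2 and divides the rate by eta^2.
    """
    return mu / ((1 + delta) ** 2 * (2 + delta) * eta2 * math.log(2))


def gaussian_upper_bound(mu, eta2=1.0):
    """Upper bound mu / (2 eta^2 ln 2) on the capacity per unit energy."""
    return mu / (2 * eta2 * math.log(2))


mu, eta2 = 0.8, 2.0
for delta in (0.5, 0.1, 0.01, 1e-6):
    gap = gaussian_upper_bound(mu, eta2) - achieved_rate_per_unit_energy(mu, delta, eta2)
    print(delta, gap)
```

The gap shrinks to zero as $\delta$ decreases, which is the sense in which the scheme is asymptotically optimal for the Gaussian channel with synchronization errors.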

VII. Proof of Theorem 2

This section adapts the coding scheme for the Gaussian insertion/deletion/substitution channel with
known value of $\mu$ described in Section VI to the compound setting with $\mu$ only known to be in the range
$[\mu_1, \mu_2]$. As before, we will first assume that the noise power is $\eta^2 = 1$ and then generalize the result for
arbitrary values of $\eta^2$.

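Encoding: Fix a target error probability $\varepsilon \in (0, 1)$ and a number $\delta \in (0, \mu_1)$. Let $x^* = x^*(M) > 0$ be
a nonzero channel input. The codeword for message $m \in \{1, 2, \ldots, M\}$ is

\[
x_m[t] \triangleq
\begin{cases}
x^* B_m^{-1/2}, & \text{if } t \in \{N_m + 1, \ldots, N_m + B_m\} \\
0, & \text{otherwise}
\end{cases}
\]

where

\[
N_m \triangleq
\begin{cases}
0, & \text{if } m = 1 \\
\big\lceil \frac{\mu_2 + \delta}{\mu_1 - \delta} (N_{m-1} + B_{m-1}) \big\rceil, & \text{if } m > 1
\end{cases}
\]

and

\[
B_m \triangleq
\begin{cases}
\lfloor \log(M) \rfloor, & \text{if } m = 1 \\
\lfloor (\mu_2 - \mu_1 + 2\delta) N_m \rfloor, & \text{if } m > 1.
\end{cases}
\]

Observe that here, unlike the case with known $\mu$, the value of the nonzero channel input is $x^* B_m^{-1/2}$,
which depends on the message $m$. This construction is illustrated in Fig. 4. The block length of this code
is $T = N_M + B_M$, and the cost of codeword $m$ is

\[
P = B_m (x^* B_m^{-1/2})^2 = (x^*)^2. \qquad (20)
\]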

Decoding: Define the decision regions

\[
\mathcal{D}_m \triangleq
\begin{cases}
\{1\}, & \text{for } m = 1 \\
(\lfloor N_m/\log(M) \rfloor \cdot \mathbb{N}) \cap ((\mu_1 - \delta) N_m + 1,\; (\mu_2 + \delta) N_m + 1), & \text{for } m \in \{2, \ldots, M\}.
\end{cases}
\]

Note that, unlike the case with known value of $\mu$, the intervals defining the decision regions here widen as
$m$ increases. However, each decision region contains approximately the same number $(\mu_2 - \mu_1 + 2\delta) \log(M)$ of
points. It is easy to verify that the decision regions are disjoint.
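The recursive construction of the burst positions and the disjointness of the decision intervals can be checked with exact rational arithmetic. The sketch below assumes the recursions $N_1 = 0$, $B_1 = \lfloor \log(M) \rfloor$, $N_m = \lceil \frac{\mu_2+\delta}{\mu_1-\delta}(N_{m-1}+B_{m-1}) \rceil$ and $B_m = \lfloor (\mu_2-\mu_1+2\delta) N_m \rfloor$, with base-2 logarithms; the function name `compound_schedule` and the example parameters are hypothetical.

```python
import math
from fractions import Fraction


def compound_schedule(M, mu1, mu2, delta):
    """Return lists (N, B) of burst start offsets and burst lengths.

    N[m - 1] and B[m - 1] correspond to message m. The recursion is the
    one assumed in the lead-in, with log taken to base 2.
    """
    ratio = (mu2 + delta) / (mu1 - delta)
    N = [0]
    B = [math.floor(math.log2(M))]
    for _ in range(2, M + 1):
        n = math.ceil(ratio * (N[-1] + B[-1]))
        N.append(n)
        B.append(math.floor((mu2 - mu1 + 2 * delta) * n))
    return N, B


mu1, mu2, delta = Fraction(1, 2), Fraction(2), Fraction(1, 10)
N, B = compound_schedule(8, mu1, mu2, delta)
# Right end of interval m versus left end of interval m + 1.
ordered = all(
    (mu2 + delta) * N[i] + 1 <= (mu1 - delta) * N[i + 1] + 1
    for i in range(len(N) - 1)
)
print(N[:3], B[:3], ordered)
```

Note that the offsets grow geometrically in $m$ under this recursion, so the block length is very large; this does not affect the rate per unit cost, which depends only on the transmitted energy.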
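For $\ell \in \mathcal{D}_m$, define the subsequences

\[
y_\ell \triangleq (y[\ell], y[\ell+1], \ldots, y[\ell + \lfloor (\mu_1 - \delta) B_m \rfloor - 1])
\]

and

\[
x_\ell \triangleq (x[\ell], x[\ell+1], \ldots, x[\ell + \lfloor (\mu_1 - \delta) B_m \rfloor - 1])
\]

of length $\lfloor (\mu_1 - \delta) B_m \rfloor$. We point out that here, unlike the case with known value of $\mu$, the subsequences
in different regions $\mathcal{D}_m$ and $\mathcal{D}_{m'}$ have different lengths.
The receiver independently performs the hypothesis test

\[
\frac{1}{\sqrt{\lfloor (\mu_1 - \delta) B_m \rfloor}} \langle y_\ell, \mathbf{1} \rangle
\underset{H_0}{\overset{H_1}{\gtrless}} \sqrt{(2+\delta) \ln(M)}
\]

for each $\ell \in \mathcal{D}_m$, $m \in \{1, \ldots, M\}$, and where $\langle \cdot, \cdot \rangle$ denotes the inner product. Let $\hat{H}_\ell$ be the decision of
the hypothesis test for $y_\ell$. As in the case of known $\mu$, the receiver declares that message $m$ was sent if
$\hat{H}_\ell = H_1$ for some $\ell \in \mathcal{D}_m$ and $\hat{H}_\ell = H_0$ for all $\ell \in \mathcal{D}_{m'}$ with $m' \neq m$. If no such $m$ exists, an error is
declared.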

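Error Analysis: Assume that message $m$ was sent. We define the same error events as in the case of
known $\mu$. Let $\mathcal{E}_{1,\ell}$ be the event that $\hat{H}_\ell = H_0$ and $\mathcal{E}_{2,\ell}$ be the event that $\hat{H}_\ell = H_1$. Define the
missed-detection event

\[
\mathcal{E}_1 \triangleq \bigcap_{\ell \in \mathcal{D}_m} \mathcal{E}_{1,\ell}
\]

and the false-alarm event

\[
\mathcal{E}_2 \triangleq \bigcup_{m' \neq m} \bigcup_{\ell \in \mathcal{D}_{m'}} \mathcal{E}_{2,\ell}.
\]

The probability of decoding error for message $m$ is then equal to $\mathbb{P}_m(\mathcal{E}_1 \cup \mathcal{E}_2)$, where as before $\mathbb{P}_m$ denotes
probability conditioned on message $m$ being sent.

We again define two auxiliary events describing the behavior of the IDC. Let $\mathcal{E}_3$ be the event that the
total number of symbols in $x$ resulting from the first $N_m$ transmitted symbols is outside
$((\mu_1 - \delta) N_m,\; (\mu_2 + \delta) N_m)$, and let $\mathcal{E}_4$ be the event that the number of symbols in $x$ resulting from symbols
transmitted during time slots $N_m + 1$ to $N_m + B_m$ is outside $((\mu_1 - \delta) B_m,\; (\mu_2 + \delta) B_m)$. We can again upper
bound the probability of error as

\[
\mathbb{P}_m(\mathcal{E}_1 \cup \mathcal{E}_2) \leq \mathbb{P}_m(\mathcal{E}_3) + \mathbb{P}_m(\mathcal{E}_4)
+ \mathbb{P}_m(\mathcal{E}_1 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c) + \mathbb{P}_m(\mathcal{E}_2 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c). \qquad (21)
\]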
We start with the analysis of $\mathbb{P}_m(\mathcal{E}_3)$ and $\mathbb{P}_m(\mathcal{E}_4)$. Using Chebyshev's inequality together with the upper
bound $\sigma^2$ on the variance of the states $s[t]$, we obtain similarly to the case with known value of $\mu$

\[
\mathbb{P}_m(\mathcal{E}_3) \leq \varepsilon/4 \qquad (22)
\]

and

\[
\mathbb{P}_m(\mathcal{E}_4) \leq \varepsilon/4 \qquad (23)
\]

for $M$ large enough and for any value of $\mu \in [\mu_1, \mu_2]$.

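We proceed with the analysis of $\mathbb{P}_m(\mathcal{E}_1 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c)$ and $\mathbb{P}_m(\mathcal{E}_2 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c)$. Conditioned on message $m$ being
sent and on $\mathcal{E}_3^c \cap \mathcal{E}_4^c$, the elements in the decision regions satisfy the following two properties for $M$ large
enough (not depending on $m$):

1) For every $\ell \in \mathcal{D}_{m'}$ with $m' \neq m$, we have $\langle x_\ell, \mathbf{1} \rangle = 0$. Hence,

\[
\frac{1}{\sqrt{\lfloor (\mu_1 - \delta) B_{m'} \rfloor}} \langle y_\ell, \mathbf{1} \rangle
\]

is Gaussian with mean zero and variance one.

2) There exists at least one $\ell \in \mathcal{D}_m$ such that

\[
\langle x_\ell, \mathbf{1} \rangle \geq x^* B_m^{-1/2} (\lfloor (\mu_1 - \delta) B_m \rfloor - N_m/\log(M)).
\]

Hence,

\[
\frac{1}{\sqrt{\lfloor (\mu_1 - \delta) B_m \rfloor}} \langle y_\ell, \mathbf{1} \rangle
\]

is Gaussian with mean at least

\[
x^* \sqrt{\mu_1 - \delta - 1/B_m}\, \Big( 1 - \frac{N_m}{\lfloor (\mu_1 - \delta) B_m \rfloor \log(M)} \Big)
\]

and variance one.

Property 2 can be proved using arguments analogous to the case with known value of $\mu$. For property 1,
we need to argue that the burst of symbols $x^*$ cannot be shifted into the incorrect decoding region.

Assume first $m' < m$. The right-most element of $\mathcal{D}_{m'}$ is at position less than or equal to $(\mu_2 + \delta) N_{m-1} + 1$,
and thus the right-most element of $x_\ell$ with $\ell \in \mathcal{D}_{m'}$ is at position at most

\[
(\mu_2 + \delta) N_{m-1} + (\mu_1 - \delta) B_{m-1} \leq (\mu_2 + \delta)(N_{m-1} + B_{m-1}).
\]

Now, conditioned on $\mathcal{E}_3^c$, there are at least $(\mu_1 - \delta) N_m$ symbols in $x$ before the first symbol $x^*$. For
there to be no overlap, it is sufficient to argue that

\[
(\mu_2 + \delta)(N_{m-1} + B_{m-1}) \leq (\mu_1 - \delta) N_m,
\]

or, equivalently, that

\[
N_m \geq \frac{\mu_2 + \delta}{\mu_1 - \delta} (N_{m-1} + B_{m-1}).
\]

This holds by the definition of $N_m$.

Assume then that $m' > m$. The left-most element of any $x_\ell$ with $\ell \in \mathcal{D}_{m'}$ is at position at least
$(\mu_1 - \delta) N_{m+1} + 1$. Conditioned on $\mathcal{E}_3^c$, there are at most $(\mu_2 + \delta) N_m$ symbols before the first symbol
$x^*$ in $x$. Conditioned on $\mathcal{E}_4^c$, the burst of symbols in $x$ is of length at most $(\mu_2 + \delta) B_m$. For there to be
no overlap, it is sufficient that

\[
(\mu_2 + \delta) N_m + (\mu_2 + \delta) B_m \leq (\mu_1 - \delta) N_{m+1},
\]

or, equivalently, that

\[
N_{m+1} \geq \frac{\mu_2 + \delta}{\mu_1 - \delta} (N_m + B_m).
\]

This holds again by the definition of $N_{m+1}$.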

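The two properties allow us to analyze the events $\mathcal{E}_1$ and $\mathcal{E}_2$. By property 1,

\[
\mathbb{P}_m(\mathcal{E}_2 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c)
\leq \sum_{m' \neq m} \sum_{\ell \in \mathcal{D}_{m'}} \mathbb{P}_m\Big( \frac{1}{\sqrt{\lfloor (\mu_1 - \delta) B_{m'} \rfloor}} \langle y_\ell, \mathbf{1} \rangle \geq \sqrt{(2+\delta) \ln(M)} \;\Big|\; \mathcal{E}_3^c \cap \mathcal{E}_4^c \Big)
\leq \sum_{m'=1}^{M} |\mathcal{D}_{m'}|\, Q\big(\sqrt{(2+\delta) \ln(M)}\big).
\]

Using the Chernoff bound $Q(a) \leq \exp(-a^2/2)$ for the Q-function,

\[
Q\big(\sqrt{(2+\delta) \ln(M)}\big) \leq M^{-(1+\delta/2)}.
\]

Moreover,

\[
|\mathcal{D}_m| \leq \frac{(\mu_2 - \mu_1 + 2\delta) N_m}{N_m/\log(M) - 1} + 1 \leq O(\log(M)),
\]

so that

\[
\sum_{m'=1}^{M} |\mathcal{D}_{m'}| \leq O(M \log(M))
\]

as $M \to \infty$. Hence,

\[
\mathbb{P}_m(\mathcal{E}_2 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c) \leq O(M^{-\delta/2} \log(M)) \leq \varepsilon/4 \qquad (24)
\]

for $M$ large enough.

Consider then the value of $\ell \in \mathcal{D}_m$ guaranteed by property 2. For this $\ell$,

\[
\mathbb{P}_m(\mathcal{E}_1 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c)
\leq \mathbb{P}_m(\mathcal{E}_{1,\ell} \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c)
\]
\[
\leq \mathbb{P}_m\Big( \frac{1}{\sqrt{\lfloor (\mu_1 - \delta) B_m \rfloor}} \langle y_\ell, \mathbf{1} \rangle < \sqrt{(2+\delta) \ln(M)} \;\Big|\; \mathcal{E}_3^c \cap \mathcal{E}_4^c \Big)
\]
\[
\leq Q\Big( x^* \sqrt{\mu_1 - \delta - 1/B_m}\, \Big( 1 - \frac{N_m}{\lfloor (\mu_1 - \delta) B_m \rfloor \log(M)} \Big) - \sqrt{(2+\delta) \ln(M)} \Big).
\]

Note that

\[
1 - \frac{N_m}{\lfloor (\mu_1 - \delta) B_m \rfloor \log(M)} = 1
\]

for $m = 1$ (since $N_1 = 0$), and

\[
1 - \frac{N_m}{\lfloor (\mu_1 - \delta) B_m \rfloor \log(M)}
\geq 1 - \frac{1}{\big( (\mu_1 - \delta)(\mu_2 - \mu_1 + 2\delta - 1/N_m) - 1/N_m \big) \log(M)}
\geq 1 - o(1)
\]

as $M \to \infty$ for $m > 1$. Furthermore,

\[
\sqrt{\mu_1 - \delta - 1/B_m} \geq \sqrt{\mu_1 - \delta}\,(1 - o(1))
\]

as $M \to \infty$.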
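By choosing

\[
x^* \triangleq (1+\delta) \sqrt{(2+\delta) \ln(M) / (\mu_1 - \delta)}, \qquad (25)
\]

we obtain

\[
\mathbb{P}_m(\mathcal{E}_1 \mid \mathcal{E}_3^c \cap \mathcal{E}_4^c) \leq Q\big( (\delta - o(1)) \sqrt{(2+\delta) \ln(M)} \big) \leq \varepsilon/4 \qquad (26)
\]

for $M$ large enough.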

Substituting (22), (23), (24), and (26) into (21) shows that for $M$ large enough the probability of
decoding error is upper bounded by $\varepsilon$ for every message $m$. By (20) and (25), the power required by this
scheme is

\[
P = (x^*)^2 = (1+\delta)^2 (2+\delta) \ln(M) / (\mu_1 - \delta).
\]

Hence, the achievable rate per unit cost for this scheme is

\[
\hat{R} \triangleq \frac{\log(M)}{P} \geq \frac{\mu_1 - \delta}{(1+\delta)^2 (2+\delta) \ln(2)}.
\]

Since $\delta > 0$ can be made arbitrarily small, this shows that, for noise power $\eta^2 = 1$, the capacity per unit
cost of the compound channel is lower bounded by

\[
\frac{\mu_1}{2 \ln(2)}.
\]

By scaling the input and output as in the proof of Theorem 1 in Section VI, this implies the lower bound

\[
\frac{\mu_1}{2\eta^2 \ln(2)}
\]

for any value of noise power $\eta^2$, concluding the proof.